Overview
Dataset statistics
| Number of variables | 6 |
|---|---|
| Number of observations | 14230077 |
| Missing cells | 9 |
| Missing cells (%) | < 0.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 651.4 MiB |
| Average record size in memory | 48.0 B |
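The memory figures above are consistent with a fixed 8 bytes per cell, which is what a shallow count of object pointers would give on a 64-bit build. That reading is an inference from the arithmetic, not something the report states. A quick check, where `BYTES_PER_CELL` is the assumed pointer size:

```python
# Figures taken from the Dataset statistics table.
N_ROWS = 14_230_077
N_COLS = 6
MIB = 1024 ** 2

# Assumption: each cell is counted as one 8-byte object pointer.
BYTES_PER_CELL = 8

per_column_mib = N_ROWS * BYTES_PER_CELL / MIB
total_mib = per_column_mib * N_COLS
per_record_bytes = N_COLS * BYTES_PER_CELL

print(f"{per_column_mib:.1f} MiB per column")
print(f"{total_mib:.1f} MiB in total")
print(f"{per_record_bytes:.1f} B per record")
```

The three values come out at 108.6 MiB, 651.4 MiB and 48.0 B, matching the per-variable memory size and the totals in this report. If the assumption holds, the string payloads themselves are not included, so the real footprint would be larger.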
Variable types
| Text | 6 |
|---|---|
Alerts
| nconst has unique values | Unique |
|---|---|
Reproduction
| Analysis started | 2025-03-06 16:20:33.014492 |
|---|---|
| Analysis finished | 2025-03-06 16:39:00.473486 |
| Duration | 18 minutes and 27.46 seconds |
| Software version | ydata-profiling v4.13.0 |
| Download configuration | config.json |
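A rerun of this analysis could look roughly like the sketch below. It assumes the input is a tab-separated file that writes nulls as `\N` and uses no quote characters; the report does not state the file format, so treat those options as assumptions. `build_report`, `tsv_path` and `html_path` are hypothetical names.

```python
import csv

# Assumptions: tab-separated input, nulls written as \N, no quoting.
READ_OPTIONS = {
    "sep": "\t",
    "na_values": ["\\N"],       # treat the \N marker as missing
    "keep_default_na": False,   # keep literal strings such as "NA" or "None"
    "quoting": csv.QUOTE_NONE,  # do not interpret quote characters in names
    "dtype": str,               # read every column as text first
}


def build_report(tsv_path, html_path, title="Profiling report"):
    """Load the file and write an HTML profiling report (hypothetical helper)."""
    # Imported lazily so the options above can be inspected without
    # pandas or ydata-profiling installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    frame = pd.read_csv(tsv_path, **READ_OPTIONS)
    # minimal=True skips the more expensive computations.
    ProfileReport(frame, title=title, minimal=True).to_file(html_path)
```

With `\N` declared as a null marker, the year, profession and title columns would report their missing cells under Missing instead of listing `\N` as an ordinary value.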
Variables
nconst
Text
Unique 
| Distinct | 14230077 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 108.6 MiB |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 9.4086813 |
| Min length | 9 |
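Because every nconst value is either 9 or 10 characters long, the split between the two lengths can be recovered from the total character count that the report lists for this variable (133886260). A small worked check, using only figures printed in this report:

```python
# Figures taken from this report.
N_VALUES = 14_230_077        # observations
TOTAL_CHARS = 133_886_260    # total characters counted for nconst
MIN_LEN, MAX_LEN = 9, 10

# Every value is MIN_LEN or MAX_LEN characters long, so each character
# beyond MIN_LEN per value belongs to exactly one MAX_LEN value.
n_long = TOTAL_CHARS - MIN_LEN * N_VALUES
n_short = N_VALUES - n_long
mean_length = TOTAL_CHARS / N_VALUES

print(n_long, n_short, round(mean_length, 7))
```

This gives 5815567 ten-character identifiers and 8414510 nine-character ones. More than half of the values have length 9, which agrees with the reported median, and the mean reproduces the 9.4086813 shown above.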
Unique
| Unique | 14230077 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | nm0000001 |
|---|---|
| 2nd row | nm0000002 |
| 3rd row | nm0000003 |
| 4th row | nm0000004 |
| 5th row | nm0000005 |
| Value | Count | Frequency (%) |
|---|---|---|
| nm0000014 | 1 | < 0.1% |
| nm9993719 | 1 | < 0.1% |
| nm0000001 | 1 | < 0.1% |
| nm0000002 | 1 | < 0.1% |
| nm0000003 | 1 | < 0.1% |
| nm0000004 | 1 | < 0.1% |
| nm0000005 | 1 | < 0.1% |
| nm0000006 | 1 | < 0.1% |
| nm0000007 | 1 | < 0.1% |
| nm0000008 | 1 | < 0.1% |
| Other values (14230067) | 14230067 | > 99.9% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 16114734 | 12.0% |
| n | 14230077 | 10.6% |
| m | 14230077 | 10.6% |
| 0 | 10350550 | 7.7% |
| 3 | 10266384 | 7.7% |
| 2 | 10261309 | 7.7% |
| 4 | 10188605 | 7.6% |
| 5 | 10156454 | 7.6% |
| 6 | 10088824 | 7.5% |
| 7 | 9368688 | 7.0% |
| Other values (2) | 18630558 | 13.9% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 133886260 | 100.0% |
primaryName
Text
| Distinct | 10909569 |
|---|---|
| Distinct (%) | 76.7% |
| Missing | 9 |
| Missing (%) | < 0.1% |
| Memory size | 108.6 MiB |
Length
| Max length | 105 |
|---|---|
| Median length | 78 |
| Mean length | 13.510703 |
| Min length | 1 |
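A median length of 78 next to a mean of 13.5 is hard to reconcile: at least half of the values are as long as the median and the rest are at least as long as the minimum, so the mean cannot fall below the midpoint of the two. The sketch below applies that bound to the Length tables of every variable in this report. It only flags the inconsistency and does not say which figure is wrong; `min_possible_mean` is a hypothetical helper.

```python
def min_possible_mean(min_len, median_len):
    """Smallest mean compatible with a given minimum and median.

    At least half of the values are >= the median and the rest are
    >= the minimum, so the mean is at least their midpoint.
    """
    return (min_len + median_len) / 2


# (min, median, mean) as printed in the Length tables of this report.
REPORTED = {
    "nconst": (9, 9, 9.4086813),
    "primaryName": (1, 78, 13.510703),
    "birthYear": (1, 2, 2.0899909),
    "deathYear": (2, 2, 2.0338623),
    "primaryProfession": (2, 64, 12.197102),
    "knownForTitles": (2, 42, 16.169765),
}

inconsistent = [
    name
    for name, (lo, median, mean) in REPORTED.items()
    if mean < min_possible_mean(lo, median)
]
print(inconsistent)
```

For primaryName the bound is 39.5, well above the reported mean, so the median length shown here should be treated with caution. The same check flags primaryProfession and knownForTitles.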
Unique
| Unique | 9800087 |
|---|---|
| Unique (%) | 68.9% |
Sample
| 1st row | Fred Astaire |
|---|---|
| 2nd row | Lauren Bacall |
| 3rd row | Brigitte Bardot |
| 4th row | John Belushi |
| 5th row | Ingmar Bergman |
| Value | Count | Frequency (%) |
|---|---|---|
| david | 134343 | 0.5% |
| john | 126310 | 0.4% |
| michael | 125450 | 0.4% |
| james | 87766 | 0.3% |
| de | 81566 | 0.3% |
| paul | 70447 | 0.2% |
| robert | 69169 | 0.2% |
| daniel | 68968 | 0.2% |
| chris | 68383 | 0.2% |
| thomas | 62618 | 0.2% |
| Other values (2255596) | 28730123 | 97.0% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| a | 19998123 | 10.4% |
| e | 16221563 | 8.4% |
| (space) | 15395075 | 8.0% |
| n | 13143503 | 6.8% |
| i | 13048092 | 6.8% |
| r | 12025350 | 6.3% |
| o | 10493790 | 5.5% |
| l | 8836076 | 4.6% |
| s | 6983723 | 3.6% |
| t | 6212808 | 3.2% |
| Other values (198) | 69900113 | 36.4% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 192258216 | 100.0% |
birthYear
Text
| Distinct | 559 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 108.6 MiB |
Length
| Max length | 4 |
|---|---|
| Median length | 2 |
| Mean length | 2.0899909 |
| Min length | 1 |
Unique
| Unique | 172 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 1899 |
|---|---|
| 2nd row | 1924 |
| 3rd row | 1934 |
| 4th row | 1949 |
| 5th row | 1918 |
| Value | Count | Frequency (%) |
|---|---|---|
| n | 13589764 | 95.5% |
| 1980 | 10261 | 0.1% |
| 1981 | 9966 | 0.1% |
| 1979 | 9877 | 0.1% |
| 1982 | 9841 | 0.1% |
| 1978 | 9740 | 0.1% |
| 1983 | 9473 | 0.1% |
| 1984 | 9450 | 0.1% |
| 1977 | 9169 | 0.1% |
| 1985 | 9111 | 0.1% |
| Other values (549) | 553425 | 3.9% |
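The top entry `n` is most likely the null marker `\N` after tokenization: its count equals the 13589764 occurrences of `\` and of `N` reported for this variable's characters. The sketch below shows one tokenizer that would produce that result (lowercase, split on whitespace, strip surrounding punctuation). This is an assumption about how the word table was built, not a confirmed description of the profiler's code, and `tokenize` is a hypothetical helper.

```python
import string
from collections import Counter


def tokenize(value):
    """Lowercase, split on whitespace, strip surrounding punctuation."""
    words = (w.strip(string.punctuation) for w in value.lower().split())
    return [w for w in words if w]


# A few birthYear values in the style of this report's sample rows.
sample = ["1899", "1924", r"\N", r"\N", "1980"]
counts = Counter(token for value in sample for token in tokenize(value))

print(counts["n"], counts["1899"])
```

Under this tokenizer the backslash is stripped as punctuation and `N` is lowercased, so every `\N` cell is counted as the word `n`.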
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| \ | 13589764 | 45.7% |
| N | 13589764 | 45.7% |
| 1 | 719344 | 2.4% |
| 9 | 718216 | 2.4% |
| 8 | 209055 | 0.7% |
| 7 | 159791 | 0.5% |
| 6 | 137948 | 0.5% |
| 2 | 128932 | 0.4% |
| 4 | 125784 | 0.4% |
| 5 | 124575 | 0.4% |
| Other values (2) | 237558 | 0.8% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 29740731 | 100.0% |
deathYear
Text
| Distinct | 502 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 108.6 MiB |
Length
| Max length | 4 |
|---|---|
| Median length | 2 |
| Mean length | 2.0338623 |
| Min length | 2 |
Unique
| Unique | 175 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 1987 |
|---|---|
| 2nd row | 2014 |
| 3rd row | \N |
| 4th row | 1982 |
| 5th row | 2007 |
| Value | Count | Frequency (%) |
|---|---|---|
| n | 13989124 | 98.3% |
| 2021 | 7614 | 0.1% |
| 2022 | 7248 | 0.1% |
| 2020 | 7223 | 0.1% |
| 2023 | 7006 | < 0.1% |
| 2024 | 6291 | < 0.1% |
| 2019 | 6100 | < 0.1% |
| 2018 | 5871 | < 0.1% |
| 2016 | 5762 | < 0.1% |
| 2017 | 5742 | < 0.1% |
| Other values (492) | 182096 | 1.3% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| \ | 13989124 | 48.3% |
| N | 13989124 | 48.3% |
| 2 | 195697 | 0.7% |
| 0 | 195441 | 0.7% |
| 1 | 192286 | 0.7% |
| 9 | 161643 | 0.6% |
| 8 | 46973 | 0.2% |
| 7 | 40025 | 0.1% |
| 6 | 35297 | 0.1% |
| 4 | 33930 | 0.1% |
| Other values (2) | 62477 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 28942017 | 100.0% |
primaryProfession
Text
| Distinct | 23207 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 108.6 MiB |
Length
| Max length | 67 |
|---|---|
| Median length | 64 |
| Mean length | 12.197102 |
| Min length | 2 |
Unique
| Unique | 5598 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | actor,miscellaneous,producer |
|---|---|
| 2nd row | actress,soundtrack,archive_footage |
| 3rd row | actress,music_department,producer |
| 4th row | actor,writer,music_department |
| 5th row | writer,director,actor |
| Value | Count | Frequency (%) |
|---|---|---|
| n | 2784781 | 19.6% |
| actor | 2517288 | 17.7% |
| actress | 1615502 | 11.4% |
| miscellaneous | 822202 | 5.8% |
| producer | 487929 | 3.4% |
| camera_department | 439709 | 3.1% |
| art_department | 265522 | 1.9% |
| writer | 230860 | 1.6% |
| sound_department | 222394 | 1.6% |
| composer | 174736 | 1.2% |
| Other values (23197) | 4669154 | 32.8% |
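The counts in this table add up to the number of observations (14230077), which suggests that each row counts a whole comma-separated value rather than an individual profession. Under that reading, `actor` here means people whose only listed profession is actor. To count individual professions, the values would need to be split first. A minimal sketch on the five sample values shown above, with `count_professions` as a hypothetical helper:

```python
from collections import Counter

# The five sample values shown for primaryProfession in this report.
SAMPLE = [
    "actor,miscellaneous,producer",
    "actress,soundtrack,archive_footage",
    "actress,music_department,producer",
    "actor,writer,music_department",
    "writer,director,actor",
]

NULL_MARKER = r"\N"


def count_professions(values):
    """Count each profession once per row, skipping the null marker."""
    counts = Counter()
    for value in values:
        if value == NULL_MARKER:
            continue
        counts.update(value.split(","))
    return counts


per_profession = count_professions(SAMPLE)
print(per_profession["actor"], per_profession["producer"])
```

In the sample, `actor` appears in three rows and `producer` in two, although no row consists of `actor` or `producer` alone.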
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| r | 19910679 | 11.5% |
| e | 19888051 | 11.5% |
| t | 18427783 | 10.6% |
| a | 16522430 | 9.5% |
| c | 12802345 | 7.4% |
| o | 11510696 | 6.6% |
| s | 10612078 | 6.1% |
| n | 7764586 | 4.5% |
| m | 7672207 | 4.4% |
| i | 7292577 | 4.2% |
| Other values (16) | 41162273 | 23.7% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 173565705 | 100.0% |
knownForTitles
Text
| Distinct | 5915735 |
|---|---|
| Distinct (%) | 41.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 108.6 MiB |
Length
| Max length | 43 |
|---|---|
| Median length | 42 |
| Mean length | 16.169765 |
| Min length | 2 |
Unique
| Unique | 4889807 |
|---|---|
| Unique (%) | 34.4% |
Sample
| 1st row | tt0072308,tt0050419,tt0027125,tt0031983 |
|---|---|
| 2nd row | tt0037382,tt0075213,tt0117057,tt0038355 |
| 3rd row | tt0057345,tt0049189,tt0056404,tt0054452 |
| 4th row | tt0072562,tt0077975,tt0080455,tt0078723 |
| 5th row | tt0050986,tt0069467,tt0050976,tt0083922 |
| Value | Count | Frequency (%) |
|---|---|---|
| n | 1621517 | 11.4% |
| tt0123338 | 8258 | 0.1% |
| tt22014400 | 7508 | 0.1% |
| tt6168110 | 6382 | < 0.1% |
| tt0441074 | 4879 | < 0.1% |
| tt0072584 | 4305 | < 0.1% |
| tt0159881 | 4067 | < 0.1% |
| tt11874658 | 3926 | < 0.1% |
| tt0479832 | 3898 | < 0.1% |
| tt4202558 | 3625 | < 0.1% |
| Other values (5915725) | 12561712 | 88.3% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| t | 46545878 | 20.2% |
| 0 | 22941588 | 10.0% |
| 1 | 21138130 | 9.2% |
| 2 | 19765024 | 8.6% |
| 4 | 17139758 | 7.4% |
| 3 | 16574438 | 7.2% |
| 8 | 15936007 | 6.9% |
| 6 | 15676043 | 6.8% |
| 5 | 13796997 | 6.0% |
| 7 | 13496133 | 5.9% |
| Other values (4) | 27087002 | 11.8% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 230096998 | 100.0% |
Missing values
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completeness.
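The report counts only 9 missing cells, all in primaryName, yet `\N` accounts for 13589764 birthYear values (about 95.5%) and 13989124 deathYear values (about 98.3%). That suggests `\N` is a null marker that was read as ordinary text. The sketch below counts the marker per column on three rows taken from the Sample section of this report; `count_missing` is a hypothetical helper.

```python
NULL_MARKER = r"\N"

COLUMNS = [
    "nconst", "primaryName", "birthYear",
    "deathYear", "primaryProfession", "knownForTitles",
]

# Three rows copied from the Sample section of this report.
ROWS = [
    ["nm0000003", "Brigitte Bardot", "1934", NULL_MARKER,
     "actress,music_department,producer",
     "tt0057345,tt0049189,tt0056404,tt0054452"],
    ["nm9993710", "Nestor Rudnytskyy", NULL_MARKER, NULL_MARKER,
     NULL_MARKER, NULL_MARKER],
    ["nm9993714", "Romeo del Rosario", NULL_MARKER, NULL_MARKER,
     "animation_department,art_department",
     "tt11657662,tt14069590,tt2455546"],
]


def count_missing(rows, columns, marker=NULL_MARKER):
    """Count cells equal to the null marker, per column."""
    counts = {column: 0 for column in columns}
    for row in rows:
        for column, value in zip(columns, row):
            if value == marker:
                counts[column] += 1
    return counts


missing = count_missing(ROWS, COLUMNS)
print(missing)
```

On these three rows the marker appears twice in birthYear, three times in deathYear, and once each in primaryProfession and knownForTitles, none of which the report counts as missing.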
Sample
First rows
| | nconst | primaryName | birthYear | deathYear | primaryProfession | knownForTitles |
|---|---|---|---|---|---|---|
| 0 | nm0000001 | Fred Astaire | 1899 | 1987 | actor,miscellaneous,producer | tt0072308,tt0050419,tt0027125,tt0031983 |
| 1 | nm0000002 | Lauren Bacall | 1924 | 2014 | actress,soundtrack,archive_footage | tt0037382,tt0075213,tt0117057,tt0038355 |
| 2 | nm0000003 | Brigitte Bardot | 1934 | \N | actress,music_department,producer | tt0057345,tt0049189,tt0056404,tt0054452 |
| 3 | nm0000004 | John Belushi | 1949 | 1982 | actor,writer,music_department | tt0072562,tt0077975,tt0080455,tt0078723 |
| 4 | nm0000005 | Ingmar Bergman | 1918 | 2007 | writer,director,actor | tt0050986,tt0069467,tt0050976,tt0083922 |
| 5 | nm0000006 | Ingrid Bergman | 1915 | 1982 | actress,producer,soundtrack | tt0034583,tt0038109,tt0036855,tt0038787 |
| 6 | nm0000007 | Humphrey Bogart | 1899 | 1957 | actor,producer,miscellaneous | tt0034583,tt0043265,tt0033870,tt0037382 |
| 7 | nm0000008 | Marlon Brando | 1924 | 2004 | actor,director,writer | tt0078788,tt0068646,tt0047296,tt0070849 |
| 8 | nm0000009 | Richard Burton | 1925 | 1984 | actor,producer,director | tt0061184,tt0087803,tt0059749,tt0057877 |
| 9 | nm0000010 | James Cagney | 1899 | 1986 | actor,director,producer | tt0029870,tt0031867,tt0042041,tt0034236 |
Last rows
| | nconst | primaryName | birthYear | deathYear | primaryProfession | knownForTitles |
|---|---|---|---|---|---|---|
| 14230067 | nm9993709 | Lu Bevins | \N | \N | producer,director,writer | tt17717854,tt11772904,tt11772812,tt11697102 |
| 14230068 | nm9993710 | Nestor Rudnytskyy | \N | \N | \N | \N |
| 14230069 | nm9993711 | David Gluzman | \N | \N | \N | \N |
| 14230070 | nm9993712 | Corny O'Connell | \N | \N | \N | \N |
| 14230071 | nm9993713 | Sambit Mishra | \N | \N | writer,producer | tt20319332,tt27191658,tt10709066,tt15134202 |
| 14230072 | nm9993714 | Romeo del Rosario | \N | \N | animation_department,art_department | tt11657662,tt14069590,tt2455546 |
| 14230073 | nm9993716 | Essias Loberg | \N | \N | \N | \N |
| 14230074 | nm9993717 | Harikrishnan Rajan | \N | \N | cinematographer | tt8736744 |
| 14230075 | nm9993718 | Aayush Nair | \N | \N | cinematographer | tt8736744 |
| 14230076 | nm9993719 | Andre Hill | \N | \N | \N | \N |